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Adaptive spatial binning of integral-field spectroscopic 
data using Voronoi tessellations 



cn 
o 
o 

(N 
m 



> 
(N 

\o 

(N 
(N 
O 
cn 
o 

Oh 

I ■ 

o ■ 



Michele Cappellari^* and Yannick Copin^'^ 

^ Leiden Observatory, Postbus 9513, 2300 RA Leiden, The Netherlands 
^ Institut de physique nucleaire de Lyon, 69222 Villeurbanne, France 



2 February 2008 



ABSTRACT 

We present new techniques to perform adaptive spatial binning of Integral-Field Spec- 
troscopic (IFS) data to reach a chosen constant signal-to-noise ratio per bin. These 
methods are required for the proper analysis of IFS observations, but can also be used 
for standard photometric imagery or any other two-dimensional data. Various schemes 
are tested and compared by binning and extracting the stellar kinematics of the Sa 
galaxy NGC 2273 from spectra obtained with the panoramic IFS SAURON. 
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1 INTRODUCTION 

Spatially resolved astronomical observations commonly span 
orders of magnitude variations in the signal-to-noise ratio 
(S/N) across the detector elements, and some spatial ele- 
ments are often flawed by insufficient S/N. For this reason, 
data are often locally averaged together before analyzing 
them: the resulting S/N is significantly better at the price 
of a degraded spatial resolution. 

Two kinds of averaging processes are possible: smooth- 
ing and binning. The smoothing scheme increases the lo- 
cal S/N by somehow correlating neighboring data — e.g., 
during a convolution process — while retaining the initial 
sampling, and therefore the initial number of data points. 
On the other hand, the binning technique locally groups 
and averages together data, resulting in a sparser sampling 
and a decreased number of final data points. In both cases, 
the number of independent measurements is equivalently de- 
creased. While the smoothing technique might present the 
advantage of simplicity — see e.g., the very wide use of the 
'median filtering' — it is difficult to handle the statistics of 
the correlations it introduces. As a consequence, smoothing 
is never used for the quantitative analysis of one-dimensional 
(ID) spectra. When a precise error management is needed, 
or when the actual number of data points to be treated mat- 
ters (e.g., for a heavy subsequent numerical computation), 
binning is more suited. 

Given the large variations in the S/N across the detec- 
tor elements — e.g., from the inner to the outer parts of 
a galaxy CCD image — it is of interest to use an adaptive 
binning scheme where the size of the bin is adapted to the 
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local S/N: bigger bins will be applied in the low-S/N re- 
gions, while a higher resolution — i.e., smaller bins — will 
be retained in the high-S /N parts. A well known example is 
galaxy photometry, wh ere logarithmically spaced radial bins 
are often adopted (e.e. Ijedrzeiewskilll987ll : many more pix- 
els are used to compute the value of a galaxy profile at large 
radii than in the center. However, these algorithms generally 
use an a priori knowledge of the S/N distribution (e.g. the 
surface brightness profile of the galaxy) to build the binning 
scheme. 

Binning is essential in the case of spectroscopic obser- 
vations of the stellar kinematics. In fact the extraction of 
the line-of-sight velocity distribution (LOSVD) generally in- 
volves nonlinear processes and as a result a minimum S /N is 
required for a reliable and unbiased extraction of kinomati- 
cal in formation from the spectra (e.g. Kuiikcn & Mcrrifiel^ 
I1993I and references therein). Accordingly, binning is invari- 
ably used to analyze ID (e.g. long slit) spectroscopic ob- 
servations. Developments with Integral-Field Spectroscopy 
(IFS; e.g., SAURON on WHT, VIMDS on VLT, GMDS on Gemini, 
etc.) require methods to perform spatial binning of spectra 
in two dimensions (2D) too. 

Little work has been done on the subject of fully adap- 
tive 2D-binning. Sanders & Fabian (2001) developed an al- 
gorithm for X-ray imaging data, but the bins that their 
method produces can contain other bins and are not com- 
pact. In the case of spectroscopic data it makes little sense 
to bin together spectra coming from pixels that are not close 
enough to each other and whose properties may difi'er con- 
siderably. Other schemes have to be developed and are de- 
scribed in this paper. In section |21 we discuss the specific 
requirements of the 2D-binning problem. In section |3 we 
present a solution based on the Quadtree method, while in 
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section^Jwe develop new methods based on Voronoi tessella- 
tions. In section|^we present an optimal Voronoi 2D-binning 
algorithm and finally in section |^ we draw some conclusions. 



2 FORMULATION OF THE PROBLEM 

We tackle here the problem of binning in the spatial direc- 
tion(s). In what follows the term 'pixel' refers to a given 
spatial element of the dataset (sometimes called 'spaxel' in 
the IFS community): it can be an actual pixel of a CCD 
image, or a spectrum position along the slit of a long-slit 
spectrograph or in the field of view of an IFS (e.g. a lenslet 
or a fiber). 

Each pixel i has an associated signal Si and its corre- 
sponding noise A/i. The pixel signal-to-noise ratio is there- 
fore (S/N)i — Si/Ni. Our considerations do not depend on 
the details used to estimate these quantities, which we as- 
sume to be known beforehand. In the case of spectroscopy 
for instance, the quantity Si (resp. Ni) associated with a 
given spectrum Si (A) can be the signal (resp. the noise 
Ni{X)) averaged over a given spectral range AA: 
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When two pixels are coadded the S/N of the resulting bin 
is computed according to the standard formula: 

S1+S2 



(S/N) 



1+2 = 



(2) 



It is important to remember that the term 'binning' will only 
refer here to the averaging of observations taken at different 
positions on the sky (i.e., different pixels), and not along the 
spectral direction. 

The binning method in ID is easy to implement: one 
only has to make sure that all the bins are adjacent and that 
their S/N reach a minimum value (or better that the scatter 
around the target S/N is minimum). This leads in practice 
to a unique binning solution (see a practical algorithm in 
section . In 2D (and higher dimensions) the situation is 
more complex as the shape of the bin has then to be taken 
into consideration. A good binning scheme has to satisfy the 
following requirements: 

Topological requirement: the bins should properly tes- 
sellate the relevant region of the sky plane, i.e., create a 
partition of without overlapping or holes. While this re- 
quirement is trivial to enforce in ID, it is tricky to implement 
in higher dimensions, where the bin shapes vary; 

Morphological requirement: the bin shape has to be 
as 'compact' (or 'round') as possible, so that the pixels in 
one bin are as close as possible to each other and can be 
associated with a well-defined spatial position. In this way 
the best spatial resolution is obtained along all directions; 

Uniformity requirement: the scatter in the S/N of the 
bins, around a target value, should be as small as possible. 
While a minimum S/N is generally required, one does not 
want to sacrifice spatial resolution to increase the bin S/N 
even further. 

In what follows we consider different methods and we 
apply each one to observations of the barred Sa galaxy 
NGC 2273 (Fig. [TJ taken with the panoramic IFS SAURQN 
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Figure 1. SAURDN observation of the barred Sa galaxy NGC 2273. 
Top panel: reconstructed total intensity: the bar is clearly visible. 
Contours are superimposed in 0.5 mag arcsec"'^ steps. Bottom 
panel: map of the average S/N per spectral element. Contours 
are superimposed in the range 10-50 with steps of 5. Note the 
two vertical S/N jumps close to the middle of the frame and the 
irregular boundaries due to the merging process of the different 
exposures. 



jBacon et alJ I2001I: Ide Zeeuw et alJ l2002h at the 4.2-m 
William Herschel telescope on La Palma. These observa- 
tions, carried on in March 2001, are based on two point- 
ings of 4 X 1800 s exposures. They cover a field-of-view of 
49" X 44" with an effective spatial samphng of 0"8 x O'.'S, 
over the 4810-5350 A spectral range. They have been se- 
lected for having high S/N contrast between the inner and 
the outer parts (in the range ~ 1-150) and a complex S/N- 
distribution, caused by the presence of spiral arms and the 
merging of multiple exposures (leading to S/N-jumps and 
irregular outer boundaries, see Fig. In the following ex- 
periments our goal is to bin these data with a target S /N of 
(S/N)t = 50 per spectral element on average. 
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2.1 Optimal weighting of pixels 

Binning is not magic: in 2D, as in ID, some useful informa- 
tion has to be present in the pixels for any binning method 
to be elfective. In ID one usually restricts the spatial region 
to bin, along the slit, to some range where the pixels are 
not fully dominated by instrumental noise. The same can 
be done in 2D by discarding pixels that contain virtually no 
signal from the object under examination. This is not neces- 
sary for the SAURON observations discussed here, where the 
noise in all the pixels is essentially Poissonian (A/i ~ y/Si)- 
An alternative approach is to optimally weight the pix- 
els during the summation in a bin: for a given set of pixels, 
the weights for which the sum ||5J provides the max imum 
S/N are given by Wi = S^/Uf fe.g. lRobertson l'l986h. This 
weighting makes full use of photon-noise dominated pixels 
(wi ~ 1), and automatically eliminates from the summation 
pixels that contain virtually no signal (wi ~ 0). Equation ||5J 
then becomes: 

(S/N)i+2 = V(S/N)? + (S/N)i. (3) 

All the following considerations apply unaltered irre- 
spective of whether equation ^ is used instead of equa- 
tion ||5J| for the estimation of the S/N of each bin. More- 
over the two equations automatically coincide in the limit 
of Poissonian noise and in that case no distinction needs to 
be made. However, the price to pay for the use of the opti- 
mal weighting is that the weights of each pixel contributing 
to a bin have to be recorded for a complete description of 
the bin. This complicates the quantitative interpretation of 
binned data (e.g., in dynamical modeling). 



3 QUADTREE METHOD 

It is instruct ive to first consider the Quadtree algorithm 
^Samel^ll984^ . which we believe is close to the best 'regular' 
image processing method that can be used for the present 
application. We show that the Quadtree method cannot pro- 
duce an optimal binning — that is satisfying all three pre- 
vious requirements — so that more complex schemes are 
required. These will be presented in sectional 

The Quadtree method consists of a recursive partition 
of a region of the plane into axis-aligned squares. Initially 
one square, the root, covers the entire region. Subsequently 
each square, whose S/N is higher than a given threshold, is 
divided into four equal child squares, by splitting with hori- 
zontal and vertical segments through its center. The collec- 
tion of squares then form a tree, with the root square at the 
top and with smaller squares at lower levels of the tree. 

In Fig. 121 the Quadtree method was used to rebin the 
S/N map of Fig. Q into squares satisfying a minimum S/N 
requirement. The nice feature of this binning method is that 
the resulting bins are squares of various sizes (except at the 
border). In this way bins are easy to handle, and require 
little information to be described completely. 

There are however two major shortcomings with this 
method: 

(i) a S/N spread of a factor ~ 2 is unavoidable due to the 
fact that the bin area varies by construction in steps of a 
factor of 4; 

(ii) unless the original image has a linear size, in pixels. 
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Figure 2. Top panel: Quadtree binning of NGC 2273 for a target 
(S/N)y = 50. The bins are overlaid on the S/N map. Bottom 
panel: S/N scatter in the above binning around (S/N)^ (solid 
Une); the squares represent the S/N of each bin, as a function of 
its distance from the galaxy center. The central regions {R < 4"), 
where the pixels are already above (S/N)^ and where binning is 
not needed, are not shown in this plot, as in the foUowing ones. 
The dashed lines represent the limits of the natural scatter, that 
is (S/N)t X (V2)±i. 

which is a power of two, some bins at the border will not 
be square and generally will not meet the minimum S/N 
criterion, becoming unusable for later analysis. 



4 VORONOI TESSELLATION 

Given the inability of methods employing 'regular' bins to 
produce optimal 2D tessellations, we now consider schemes 
which do not have square or rectangular bins. Accordingly 
we consider the Voronoi Tessellation (VT) that can be used 
to generate binnings satisfying all the three requirements of 
section |5| 

Given a region Q, of the plane and a set of points {zi}^i 
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of il, called generators, a VT is a partition of D, into regions 
{Vi}^i enclosing the points closer to Zi than to any other 
generator. Each Vi is referred to as the Voronoi region and 
here as bin associated with Zi (see e.g. lOkabe et al.ll200ol 
for a comprehensive treatment). 

The VT presents many interesting features for the bin- 
ning problem: 

(i) it naturally enforces the Topological requirement; 

(ii) it is efficiently described by the sole coordinates of its 
generators; 

(iii) it is very easy to implement in the discrete case: given 
the generator positions, it is sufficient to locate the closest 
generator to any given pixel to determine the bin to which 
it belongs. 

On the other hand, the fact that a VT is adopted for 
binning does not enforce by itself the Morphological require- 
ment: the bins are convex by construction, but can have very 
sharp angles. Furthermore, the Uniformity requirement is 
not addressed in any way by the simple use of a VT. These 
two shortcomings are clearly illustrated^ in Fig.|3 The Mor- 
phological and Uniformity requirements have to be tackled 
through a properly tailored distribution of the Voronoi gen- 
erators. We present now a way to produce such a distribu- 
tion. 



4.1 Centroidal Voronoi tessellation 

In the two important cases where the noise is Poissonian 
or when the pixels in each bin are optimally weighted (sec- 
tionim.the (S/N)gi„of one bin is computed by simply sum- 
ming the (S/N)i of the corresponding pixels (equation 
One can then define a density distribution p(r) — (S/N)^(r) 
such that the problem of binning to a constant S/N reduces 
to that of obtaining a tessellation enclosing equal mass ac- 
cording to p. The Centroidal Voronoi Tessellation (CVT) is 
a technique which can be used to generate this optimally 
regular and uniform VT in the continuous case, or in the 
limit of a large number of pixels. 

Given a density function p(r) defined in a region Q, a 
CVT of SI is a special class of VT where the generators Zi 
happen to coincide with the mass centroids 



rp(r) dr 



(4) 



of the corr esponding Voronoi regions Vj. As illustrated in the 
review bv IDu. Faber fc Gunzbureeil (|l99fll ). the CVTs are 
useful to solve a variety of mathematical problems, but can 
also be observed in many real-life natural examples (living 
cells, territories of animals, etc.). 

One of the most striking characteristics of CVT in the 
2D-case is its ability to partition a region into bins whose size 
varies as a function of the underlying density distribution, 
but whose shape tends asymptotically to a hexagonal-like 
lattice for a large number of bins. Another nice feature of 
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Figure 3. Top panel: Monte Carlo VT of NGC 2273. The prob- 
ability for a generator to be at location r is proportional to p(r), 
where the local density p is proportional to the (S/N)^ in Fig. [Tl 
This ensures that the S /N enclosed in each bin is constant on av- 
erage. The generators of the VT are indicated by the black dots. 
Bottom panel: S/N scatter in the above VT. Although the values 
fluctuate around the desired (S/N)^ = 50 (solid line), the scatter 
is large: RMS error is 26% (dashed lines). 



CVT is that a simple algo rithm exists for its practical com- 
putation: the lLlovdl (119821) method, for which the CVT is a 
fixed point. 

Although CVT bins are naturally smaller where the 
density is higher, the area — p relation of the bins is not 
such that the mass enclosed in every bin is constant: the 
CVT cannot be used directly to produce equal mass bins 
(equal S/N in the case of photon noise or with optimal pix- 
els weighting, see section r2.1ll . However, we now present a 
modification of the Lloyd algorithm allowing it to converge 
toward an equi-mass 2D-CVT. 



^ The four-coloring of Fig. |21 1^ and |H] was done by flrst com- 
puting the Delaunay triangulation (see Appendix A) from the 
VT generators and then optimally coloring the correspond- 
ing graph with the Mathe matica package Combinatorica by 
iPemmaraiu fc Skienal <2003l) . 



ID-case. It has been demonstrated that in ID, and asymp- 
totically for large numbers of bins (which in ID are inter- 
vals), the size d of the CVT bins is proportional to the one- 
thir d power of the u nderlying density at the midpoint of the 
bin (IDu et al.lll999^ : d oc p~^^^ ■ Since the mass enclosed in 
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Figure 4. The p relation of a two-dimensional CVT. Over- 
plotted is the best fitting relation A oc p~^^^ (solid line) and, 
for comparison, the two relations A oc p~^l'^''^ and A oc p~^l^'^ 
(dashed lines). The relation is a perfect straight line, with the 
exception of the largest bins, closer to the border, and the smallest 
ones, far from the continuum approximation. 



each bin is m « pd, one has m oc f?^^: the mass in the CVT 
bins is only constant for a constant density. 

However a CVT can still be used to partition a segment 
into equal mass intervals. In fact, one can derive from the 
previous relations that, if the CVT is performed on a differ- 
ent density p' = p^, the partition found induces equal-mass 
bins on the original density p: m « pd oc p p'~^^^ = 1. This 
result is not being used in ID, but can be generalized in 2D 
(or higher dimensions) to produce bins that have the nice 
regularity properties of the CVTs but, in addition, enclose 
equal mass. 

2D-case. In 2D, the so-called Gersho conjecture llGershd 
ll979l:lGrav fc Neuhofjfl ggS) can be used to predict the cor- 
responding relation between the area ^ of a CVT bin and 
the underlying density p. A consequence of this conjecture 
is the ■principle of equal partition of error. In 2D, it implies 
that, if d is the typical size of the CVT bin, pd* is roughly 
constant'^. Since A oc d^, one gets the relation A oc p~^^^. 
Although a rigorous proof of this relation is not known (Du, 
private communication), we have verified with various nu- 
merical experiments that this relation is indeed very accu- 
rately verified in practice, even with a small number of bins 
(Fig.H. 

Applying the same reasoning in 2D as in the ID-case, 
to generate equal-mass bins one has to satisfy the relation 
m « pA oc pp'^^^^ — 1, which implies p' — p^ . We have 
thus shown that a CVT of the density distribution p' = 
p generates a VT with bins that asymptotically enclose a 
constant mass according to the density p. 

Given a region Q,, a density function p and A'^ generators 
{zi}^j^, our modified Lloyd method to produce an equi-mass 
2D-CVT is thus the following: 

(i) select an initial random set {z^j^i of positions for the 



^ More generally, in nD, one has d oc p2+n 



generators: the probability for a generator to be at position 
r is proportional to p(r); 

(ii) perform a VT of SI associated with the generators 
{zijfLi: with the above initial generator distribution, the 
Voronoi regions {Viji^i contain on average a constant mass, 
but the scatter is large and the bins are generally badly 
shaped (as in Fig. I^J; 

(iii) compute the mass centroids of the {Vi}f^x according 
to the density p' = p^: these constitute now the new set 
{zi}^i of generators''; 

(iv) iterate over step |(ii)| until the coordinates of the VT 
generators don't change any more. 

Fig. 13 presents a CVT produced by applying the pre- 
vious algorithm to the density p, obtained by linearly inter- 
polating the (S/N)^ of NGC 2273 in Fig.[Tlonto a grid with 
pixel size 8x smaller than the original size. This interpola- 
tion is used here to approach the continuous case and ensure 
a better convergence of the modified Lloyd algorithm. In this 
case of a large number of pixels the bins of the VT tend to 
the theoretical hexagonal shape and adapt nicely to density 
variations and to the irregular boundary of the region. The 
scatter of the S/N is also close to optimal with RMS scatter 
of ~ 3%. 

This CVT method illustrates the goals towards which 
an optimal 2D-binning algorithm should tend, but it has 
still some practical limitations: 

(i) it generates equal-mass bins but not necessarily equal- 
S/N bins, unless e.g. the noise is Poissonian or the bins are 
optimally weighted (see the beginning of this section). 

(ii) more importantly the method does not work well 
when the bins are constituted of just a few pixels. This is due 
to the fact that the Lloyd method is designed to work in a 
continuum approximation and does not necessarily converge 
to the global minimum in a d iscrete case, un less a good set 
of initial generators is chosen (IDu et alJI 19991) . In Fig.Elthe 
same generators as in Fig. |S] where used to construct a VT 
for the coarser SAURON pixel grid of Fig. The obtained VT 
is similar to that of Fig. |3 but the S /N scatter increases 
considerably, due to discretization effects. Constructing the 
bins from an oversampled and interpolated grid is not an 
acceptable solution if an accurate solution is desired, as this 
would introduce correlations within the bins. One wants to 
construct the bins by coadding actually observed pixels. 

In practice an optimal 2D-binning method has to pre- 
serve the good characteristics of the equi-mass 2D-CVT, in 
the limit of many pixels, but has to be able to take the dis- 
crete nature of pixels into account, when dealing with bins 
constituted by just a few pixels. This algorithm is the sub- 
ject of the next section. 

4.2 Bin-accretion algorithm 

Finding a good set of initial generators is crucial for the 
Lloyd algorithm to converge to the global minimum, when- 
ever the discrete nature of pixels is important. In this section 
we describe a method we developed to find the generators 

^ The original Lloyd method uses p' = p and thus does not en- 
forces equal mass bins. 
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Figure 5. Continuous case. Top panel: CVT-binning of the 'con- 
tinuous' density p'(r) = p^{r), where p is obtained by linear inter- 
polation from the (S/N)-^ in Fig. The generators of the CVT 
are indicated by the black dots. Bottom panel: S/N scatter in 
the above CVT. The S/N of the original pixels is shown with 
the crosses, while the squares represent the S/N of the final bins. 
The target (S/N)t = 50 is indicated by the soUd line. The RMS 
scatter is ~ 3% (dashed lines). 



for the optimal VT taking the discrete nature of pixels into 
account from the beginning. The algorithm described here 
constructs an initial binning trying to generalize to 2D the 
standard pixel- by-pixel binning algorithm used in ID. The 
centroids of the bins found in this way are then used as 
starting generators for a CVT. The method reduces to the 
previous CVT in the limit of many pixels, but works on a 
pixel basis with bins made by just a few pixels. 

A natural ID-binning algorithm proceeds as follows: 
start a bin from the highest S/N unbinned pixel and ac- 
crete neighboring pixels; when the target S/N is reached a 
new bin is started and this process is repeated until all pix- 
els have been binned. To extend this idea to 2D, we need 
to make a clever choice regarding the pixel to be accreted 
next, so that the topological and morphological constraints 
are naturally enforced. The adopted method always tries to 
add to the current bin the pixel that is closest to the bin 
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Figure 6. Discrete case. Top panel: VT of the SAURON pixel grid 
of Fig. obtained using the same generators (black dots) as in 
Fig. 121 Bottom panel: S/N scatter in the above VT, as a func- 
tion of the distance from the galaxy center. Note the significant 
increase of the scatter (RMS of ~ 14%, dashed lines) around 
(S/N)^ (solid line), compared to Fig. 151 due to discretization ef- 
fects. 



centroid. Furthermore, when a new bin is started, the first 
pixel is also selected as the one closest to the centroid of 
all the previously binned pixels. This simple scheme auto- 
matically tends to generate bins that are compact, and the 
bin S /N can be carefully monitored on a pixel-by-pixel basis 
during the accretion phase. The centroids of the bins found 
in this way can be used as starting generators for a CVT. 

The idea described above is quite simple, but some prac- 
tical problems have to be solved for a robust implementation 
of the algorithm. During the accretion of new bins around 
the bin centroid, the topological and morphological criteria 
tend to be satisfied by construction, but there can be situa- 
tions where this is not the case: a common example is when 
the accretion process hits the boundary of the dataset. In 
this situation, as new pixels are accreted, the shape of the 
growing bin will be dictated more by the shape of the bor- 
der of the dataset than by the centroid criterion. As a result 
some bins can become very elongated or even disconnected 
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before the target S/N is reached. If this is not prevented, 
the centroids of the final bins will not always fall close to 
the bin center and can even lie outside the bin. In this case 
the bin centroids will not provide a good set of initial gen- 
erators for the CVT and the modified Lloyd algorithm may 
not converge to an acceptable solution (e.g. some bins may 
not reach (S/N)t). 

To solve this situation the simple idea of the bin- 
accretion algorithm has to be complicated a little by making 
sure that the topological and morphological criteria are en- 
forced for all bins, and not only for most of them. In practice 
the shape of the bin is monitored during the accretion phase 
and, if the bin does not satisfy some minimal requirements, a 
new bin is initiated even if the previous bin did not meet the 
minimum S/N criterion. The bins that do not have enough 
S/N at the end of the accretion stage will be reassigned to 
the closest good bin, before computing the centroids which 
will be used as starting points of the final CVT. 



5 OPTIMAL VORONOI 2D-BINNING 
5.1 Algorithm 

In this section we translate the ideas illustrated in section^] 
into a practical and robust algorithm that can be used to bin 
actual 2D data. Our optimal Voronoi 2D-binning algorithm 
is described in detail by the following steps: 

(i) start the first bin from the highest S/N pixel of the 
image; 

(ii) evaluate the mass centroid of the current bin and se- 
lect the unbinned pixel closest to the centroid as candidate 
for addition to the bin; 

(iii) check whether all the following conditions are satis- 
fied in that precise order: 

(a) topological criterion: the new pixel is adjacent to 
the current bin; 

(b) morphological criterion: by adding the new pixel, 
the 'roundness' TZ (see below) of the current bin would 
remain below a chosen threshold; 

(c) uniformity criterion: by adding the new pixel, the 
bin S/N would not deviate from (S/N)t more than before 
the addition while the S /N is already higher than a given 
fraction (e.g., 80%) of the target S/N; 

If all the previous criteria are fulfilled, the accretion process 
can go on: add the candidate pixel to the current bin and 
go back to step |(ii)| 

(iv) the accretion process of the current bin has come to 
an end. If the uniformity criterion |(iii)(c)| was met, mark all 
the pixels in the bin as 'successfully binned', otherwise as 
'unsuccessfully binned'; 

(v) evaluate the mass centroid of all the pixels already 
binned and start a new bin from the unbinned pixel closest 
to this centroid; go back to step |(ii)| until all pixels have been 
binned; 

(vi) compute the centroid of each successful bin and reas- 
sign the 'unsuccessfully binned' pixels to the closest of these 
centroids; 

(vii) recompute the centroid of each bin obtained from 
the previous step: these centroids are then used as initial 



generators of a CVT, which is computed with the modified 
Lloyd method of section im 

The quantity TZ used in step |(iii)(b)| is any quantity 
measuring the 'roundness' of a bin. It can be simply defined 
as: 



where rmax is the maximum distance between the centroid 
of the bin and any of the bin pixels, and r-cff is the radius of 
a disk of same area as the whole bin. With this definition, 
TZ — for a perfectly circular bin and TZ > otherwise. We 
found that a roundness threshold TZmax ~ 0.3 gives good 
results. 

Fig. |7| provides a visual explanation of the four major 
stages of the above algorithm. For illustration purposes, it 
has been applied to a 'continuous' density distribution, ob- 
tained by linearly resampling the (S/N)'^ of Fig. 0onto a 
regular 8x finer grid. After a bin-accretion stage, the un- 
successfully binned pixels are fiagged and reassigned to the 
closest successful bin, and a VT is generated from the cen- 
troids of the bins. 



5.2 Application to real data 

An example of the application of our optimal Voronoi 2D- 
binning algorithm to the actual SAURON data of NGC 2273 
is shown in the top panel of Fig. |H1 while the resulting S /N 
scatter is shown in the bottom panel: the RMS value is 
~ 6%. The S/N values, symmetrically clustered around the 
target (S/N)t ~ 50, essentially represent the lowest S/N 
scatter obtainable from a binning of these data: all the scat- 
ter is due to discretization noise, which increases towards the 
galaxy center, where the bins are made of a smaller number 
of pixels. This figure shows that the method is able to guar- 
antee a minimum S/N to all bins, while retaining a regular 
shape. 

The binning resulting from this algorithm is qualita- 
tively similar to the one obtained using the CVT (see Fig.inj. 
But, by contrast to the CVT alone, this method is able to 
produce bins that are essentially optimal also in the 'small- 
bins' regime, with bins of only 2-4 pixels. 

The stellar mean velocity and velocity dispersion fields 
extracted from the binned data are shown in Fig. ^ and 
Fig. 1101 The spectra obtained with SAURON include promi- 
nent gas emission lines of II/3, [O III]AA4959, 5007 and 
[N I] A5200. The kinematics extraction was perform ed in 
pixel-space using the method o f ICappellari et alj fe0021. and 
adopting a single Kl III template star observed with SAURON. 
The spectral regions possibly contaminated by gas emission 
lines were excluded from the fit in all spectra. Not surpris- 
ingly, while for the unbinned data some information can be 
extracted only from the galaxy central regions, where the 
pixels S/N is naturally high enough, the use of binned data 
leads to consistent measurements over the whole observed 
field. 

The kinematics maps in Fig. |^ and 1101 were linearly 
interpolated using a Delaunay triangulation of the mea- 
sured values, which were assigned to the position of the flux 
weighted centroid of each bin. If the underlying kinemat- 
ics varies smoothly over the scale of the bins, and if the 
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Figure 7. Four main stages of the optimal Voronoi 2D-binning 
algoritlim are illustrated in the continuous case. From top to bot- 
tom; (i) construction of the bins by 'bin-accretion', (ii) the bins 
that did not satisfy the convergence condition are flagged (black) , 
(iii) and their pixels are reassigned to the closest good bin, (iv) 
the centroid of the new bins are evaluated and a VT is constructed 
using these as generators (crosses). 
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Figure 8. Top panel: Final result for the bins after application 
of the optimal Voronoi 2D-binning algorithm to the SAURQN ob- 
servations of NGC 2273. Bottom panel: S/N scatter in the above 
binning. The solid line represents the target S/N level, while the 
two dashed lines show the RMS scatter ('-^ 6%). Note the decrease 
of the S/N scatter compared to Fig.lHl 



measurements errors are negligible, this approach closely re- 
covers the 'true' kinematical field. Using an interpolation it 
becomes easy to visually detect regularities in the data, like 
the almost perfect point-symmetry of the velocity field in 
the bottom panel of Fig. |^ 

In general however, with non-negligible errors, any visu- 
alization scheme that tries to reproduce the measurements 
exactly creates artifacts whose reality is difficult to judge. 
The problem lies in the fact that in 2D kinematics maps 
there is no standard way to visualize error, which one is used 
to see in ID kinematics profiles. For this reason it becomes 
difficult to distinguish noise fluctuations from real features 
in the data. A solution would be to use standard non- para- 
metric methods, li ke 'smoothin g splines ' IWhaba 19901 or 
'kernel estimators' (IScottlll992h . to recover the kinematical 
field and display the maps, taking measurements errors into 
account. This assumes of course that the intrinsic 2D field 
under study is smooth on the scale of a bin. If this is not the 
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Figure 9. The stellar mean velocity V field measured from the 
unbinned SAURDN data of NGC 2273 {top panel), is compared 
to the linearly interpolated velocity field extracted from the 2D- 
binned spectra {bottom panel). The same color levels are used in 
both plots. The flux weighted centroids {not the VT generators) 
of the bins used (Fig. |2j are indicated by the black dots. Some 
representative contours of the galaxy surface brightness are also 
shown. The interpolated binned velocity field was truncated to 
the centroids of the outer bins. 



case, there is no way one can tell it from the data; binning 
is in fact only applied when noise dominates over the small 
spatial scales, and the fine details are already lost. 

The recovery of the intrinsic velocity field is however 
beyond the scope of the present paper. In 2D like in ID, 
binning is generally simply used to obtain a reliable measure 
of some quantity inside a given spatial region. The measure 
is then compared with models (e.g. dynamical ones), which 
are evaluated exactly inside the same spatial region as the 
real observation. As long as the assumption in the model are 
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Figure 10. Same as in Fig. |5] for the stellar velocity dispersion 
<T field. Very little information on the dispersion field can be ex- 
tracted without binning. 

correct, and model and data are treated in the same way, 
one does not have to worry about what happens inside each 
bin. 

Similar tests of binning IFS data were performed on 
a number of objects for which data obtained with SAURON 
were available. In addition we also performed 2D binning of 
galaxy images. The results presented here for NGC 2273 are 
representative of all the cases we tried. 

5.3 Availability 

Software implementing in IDL** the method described in this 
section is available from 

* http://www.rsinc.com/ 
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:/ /www.strw.leidenuniv.nl/~mcappell/idl/' 
In addition to the desired target S/N, only four columns of 
numbers are required as input: the coordinates {xi,yi) of 
each pixel and the corresponding signal Si and noise Ni . In 
output to each pixel a bin number will be assigned. 

The above IDL implementation took 3.5 s on a 1 GHz 
PC to generate the 2D-binning shown in Fig. |H| which is 
composed of 3107 pixels grouped into 335 bins. The compu- 
tation time scales roughly with the number of pixels to bin. 
The memory needed by the algorithm is comparable to that 
required to store the pixels themselves. 



6 CONCLUSIONS 

The problem of adaptively binning 2D data (e.g. spatial ele- 
ments of IFS or imaging data) to a constant S/N per bin has 
been analyzed in detail. The goals towards which an optimal 
algorithm should tend have been presented, and the limita- 
tions of different approaches, making innovative use of the 
Voronoi tessellations, have been discussed. Finally a robust 
algorithm that solves the 2D-binning problem in an optimal 
way has been described. The different methods have been 
applied to the binning and to the extraction of the stellar 
kinematics from the SAURON data of the barred Sa galaxy 
NGC 2273. 

Binning is invariably used to analyze ID (e.g. long-slit) 
spectroscopic observations. We believe adaptive 2D-binning 
should become common practice for the analysis of 2D data 
too (in particular spectral data). This method is being used 
systematically in the analysis of the IFS data obtained by 
the SAURON project. 

Our binning methods can be naturally extended to 
three dimensions. The CVT binning described here can also 
be used with discrete datasets, like e.g. stellar proper mo- 
tions observations for globular clusters, or datasets coming 
from N-body simulations (e.g. .Schaap fc van de Weveaerti 
I2OOOI) . or X-ray data. 
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APPENDIX A: CONNECTION WITH THE 
MESH GENERATION PROBLEM 

Closely related to the VT, is the Delaunay triangulation (in 
2D), that can be obtained from the tessellation by connect- 
ing each generator to the generators of the adjacent Voronoi 
regions. The Delaunay triangulations are widely used in var- 
ious applications such as Finite Elements methods and in- 
terpolation schemes, due to the fact that, once the set of 
generators has been defined, they optimize the geometric 
quality of the resulting triangulation. 

A complementary way of looking at the regularity of the 
grid obtained with the CVT consists of constructing the De- 
launay triangulation of the generators found by Lloyd's al- 
gorithm. This triangulation can be easily computed by con- 
necting with each other all generators of adjacent bins and 
is represented in Fig. \^ for the CVT of Fig. The tri- 
angulation obtained is composed of triangles whose shape 
is asymptotically close to equilateral, and indeed it is easy 
to understand that hexagonal Voronoi regions correspond to 
equilateral triangles of the Delaunay triangulation computed 
from the same set of generators. 

This fact makes obvious the close connection, in the con- 
tinuum case, between the problem of generating an optimal 
adaptive 2D-binning of a given region and that of construct- 
ing an optimal triangulation of the same region. These are in 
fact just two different ways of looking at the same problem. 
Ignoring boundary effects, CVT are an extremely simple but 
powerful tool for the construction of an unstructured mesh, 
and the obtained triangulations are of quality comparable 
to the most advanced and complex mesh generation meth- 
ods. The A — p relation between the size of CVT bins and 
the density, which derives from Gersho conjecture, makes it 
possible to adapt the size of the mesh to any desired distri- 
bution, while still preserving the equilateral characteristic of 
the generated mesh. 
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Figure Al. Dclaunay triangulation computed from the genera- 
tors of the VT shown in Fig. |5] 



